Breaking Bad is an American crime drama created by Vince Gilligan. It aired on AMC from January 20, 2008 to September 29, 2013. The series follows Walter White, a high-school chemistry teacher turned meth manufacturer, and his former student Jesse Pinkman.
I begin by scraping the episode tables from Wikipedia and performing basic cleaning so that all columns align properly.
# 1) Scrape data from Wiki
url <- "https://en.wikipedia.org/wiki/List_of_Breaking_Bad_episodes"
page <- read_html(url)
tables <- page %>% html_nodes("table.wikiepisodetable") %>%
html_table(header = 2, fill = TRUE)
tables_ep <- keep(tables, ~ "No.overall" %in% names(.x))
tables_clean <- map(tables_ep, function(tbl) {
names(tbl) <- make.names(names(tbl))
tbl[] <- lapply(tbl, as.character)
tbl
})
df_raw <- bind_rows(tables_clean, .id = "Season") %>%
mutate(Season = as.integer(Season))
vcol <- grep("viewers", names(df_raw), ignore.case=TRUE, value=TRUE)[1]
# Strip out citations and parse numeric
df_raw <- df_raw %>%
mutate(
Viewers = as.numeric(str_remove_all(.data[[vcol]], "\\[.*?\\]"))
)
# Filter only to seasons 1–5
df <- df_raw %>%
transmute(
Season = Season,
EpisodeOverall = as.integer(str_extract(No.overall, "\\d+")),
Title = Title,
Viewers = Viewers
) %>%
arrange(Season, EpisodeOverall) %>%
filter(Season <= 5)
Quick look at the cleaned data:
knitr::kable(head(df), caption="*Table: First 6 episodes of the cleaned dataset*")
| Season | EpisodeOverall | Title | Viewers |
|---|---|---|---|
| 1 | 1 | “Pilot” | 1.41 |
| 1 | 2 | “Cat’s in the Bag…” | 1.49 |
| 1 | 3 | “…And the Bag’s in the River” | 1.08 |
| 1 | 4 | “Cancer Man” | 1.09 |
| 1 | 5 | “Gray Matter” | 0.97 |
| 1 | 6 | “Crazy Handful of Nothin’” | 1.07 |
Data summary:
summary(df)
## Season EpisodeOverall Title Viewers
## Min. :1.000 Min. : 1.00 Length:64 Min. : 0.970
## 1st Qu.:2.000 1st Qu.:16.25 Class :character 1st Qu.: 1.323
## Median :3.000 Median :31.50 Mode :character Median : 1.650
## Mean :3.344 Mean :31.50 Mean : 2.236
## 3rd Qu.:5.000 3rd Qu.:46.75 3rd Qu.: 2.268
## Max. :5.000 Max. :62.00 Max. :10.280
## NA's :2 NA's :2
Next, I compute season‐level summaries: number of episodes and average viewers.
Plot 1 shows viewership for each episode across all five seasons. Notice the clustering and spread.
p1 <- ggplot(df, aes(x=EpisodeOverall, y=Viewers, color=factor(Season))) +
geom_line() +
geom_point() +
labs(
title = "Viewers per Episode (All Seasons)",
x = "Episode (Overall)",
y = "Viewers (millions)",
color = "Season"
) +
theme_minimal(base_size=13)
ggplotly(p1, tooltip=c("x","y","color")) %>%
layout(hovermode="closest") %>%
config(displayModeBar=FALSE)
Plot 2 shows the average viewership per season, highlighting an overall upward trajectory.
p2 <- ggplot(season_summary, aes(x=Season, y=Avg.Viewers)) +
geom_line(size=1.2, linetype="dashed", color="#205237") +
geom_point(size=4, color="#205237") +
labs(
title = "Average Viewers per Season",
x = "Season",
y = "Viewers (M)"
) +
theme_minimal(base_size=14)
ggplotly(p2, tooltip=c("x","y")) %>%
layout(hovermode="x unified") %>%
config(displayModeBar=FALSE)
Plot 3 illustrates the change in average viewership between consecutive seasons.
season_summary <- season_summary %>%
mutate(Delta = round(Avg.Viewers - lag(Avg.Viewers), 2))
p3 <- ggplot(filter(season_summary, Season > 1),
aes(x=Season, y=Delta, text=paste0("Δ=",Delta," M"))) +
geom_col(fill="#2d4d00", alpha=0.8) +
labs(
title = "Change in Average Viewers Between Seasons",
x = "Season",
y = "Δ Viewers (M)"
) +
theme_minimal(base_size=14)
ggplotly(p3, tooltip="text") %>%
config(displayModeBar=FALSE)
Breaking Bad shows a clear growth pattern:
From the modest 1.23 million average viewers in Season 1, the audience steadily increased each year, climbing to 1.31 million (Season 2), 1.52 million (Season 3), and 1.87 million (Season 4). The largest leap occurs between Seasons 4 and 5, with an increase of 2.45 million viewers to reach an impressive 4.32 million average viewers.
This trend reflects how the series gained momentum, drawing in more viewers, culminating in a finale that became a cultural phenomenon.
Final Thought:
Breaking Bad’s Breaking Bad’s escalating viewership mirrors its narrative arc, starting small, intensifying mid run, and exploding at the climax.